Abstract

Multilevel selection and the evolution of cooperation are fun- 
damental to the formation of higher-level organisation and the 
evolution of biocomplexity, but such notions are controver- 
sial and poorly understood in natural populations. The theo- 
retic principles of group selection are well developed in ide- 
alised models where a population is neatly divided into mul- 
tiple semi-isolated sub-populations. But since such models 
can be explained by individual selection given the localised 
frequency-dependent effects involved, some argue that the 
group selection concepts offered are, even in the idealised 
case, redundant, and that in natural conditions, where groups
are not well defined, a group selection framework is
entirely inapplicable. This does not necessarily mean, however,
that a natural population is not subject to some interesting lo- 
calised frequency-dependent effects - but how could we for- 
mally quantify this under realistic conditions? Here we fo- 
cus on the presence of a Simpson's Paradox where, although 
the local proportion of cooperators decreases at all locations, 
the global proportion of cooperators increases. We illustrate 
this principle in a simple individual-based model of bacte- 
rial biofilm growth and discuss various complicating factors 
in moving from theory to practice of measuring group selec- 
tion. 



Group selection in theory and practice 

Some argue that the theoretic principles of group selection
are well developed and crucial for understanding evolution
in natural populations (Wilson and Wilson, 2007; Okasha,
2006). Indeed, many artificial life models seeking to explain
the evolution of cooperation make either explicit or implicit
reference to group-level selection (e.g., Scogings and Hawick,
2008; Goldsby et al., 2009; Wu and Banzhaf, 2009). The group
selection position, however, suffers from at least two serious
problems. The first
is whether the phenomena involved, though undisputed, 
formally require group selection concepts. The second is 
whether the idealised conditions they assume are applica- 
ble in natural populations. We briefly overview the standard 
model of multilevel selection and discuss these limitations. 
Our aim is to devise a practical theoretical approach to
assess whether something interesting is happening in a
natural population with respect to the scale of selection. As
a practical exemplar, we have in mind the possibility of
group selection occurring within natural bacterial biofilms. 
Biofilms are formed when bacteria attach to a surface and 
develop into dense aggregations, and they are in fact the 
most common mode of bacterial growth (compared to well- 
mixed planktonic populations). Bacteria living in biofilms 
are known to engage in many cooperative interactions, in- 
cluding the sharing of various 'public goods' such as extra- 
cellular enzymes. Biofilms also exhibit collective properties, 
such as antibiotic resistance, that are significantly different
from those of free-living bacteria (Ghannoum and O'Toole,
2004). Accordingly, they have potential to serve as an ideal
model empirical system for studying the transition to
multicellularity (Penn et al., 2008). However, to do so, we need
to be able to connect idealised models of multilevel se- 
lection (for example, where groups are discrete and non- 
overlapping) with real-world biological systems (where the 
"groups" may simply be local neighbourhoods with no dis- 
crete boundary). In this paper, we discuss the theoretic and 
practical issues involved in studying multilevel selection in 
biofilms and other natural populations. We illustrate our dis- 
cussion with a simple individual-based model of bacterial 
growth, in which growth rate depends upon the local con- 
centration of a 'public good' that is costly to produce. As 
such, this system might be expected to fit standard theory 
on the evolution of cooperation. However, in our
individual-based model, as in many real-world cases, the groups are not
discrete and so it is not immediately obvious how, if at all, a 
multilevel selection framework can be useful. How, for ex- 
ample, can we measure the relative strengths of within- and 
between-group selection if the groups do not have discrete 
boundaries? 

Despite this practical difficulty, theoretical and
philosophical work suggests that multiple scales of selection
should still be present in such systems (Wilson, 1980;
Sober and Wilson, 1998; Nowak and May, 1992). Here, we
illustrate the use of Simpson's Paradox (Simpson, 1951;
Sober and Wilson, 1998) as a quantifiable indicator of a
group-level selection effect. Crucially, we illustrate that this
need not rely on a priori knowledge of the exact group
structure, or even on the presence of discrete group boundaries.
A Simpson's paradox occurs when, although the proportion 
of cooperators decreases in every locality, the global pro- 
portion of cooperators nevertheless increases. This can be 
measured in situ and does not require comparison with a 
well-mixed population, nor that we know the exact evolu- 
tionary game (fitness function) that individuals are engaged 
in. Then, by measuring the magnitude of the discrepancy 
between local and global proportions of cooperators over a 
range of local scales, we can identify the effective selective 
scale in a natural population. We also illustrate several fur- 
ther complicating factors that arise in moving from idealised 
theoretic models to more realistic biological scenarios. 

The idealised model of multilevel selection and 
its limitations 

The idealised model of multilevel selection involves a
population of individuals that is divided into discrete
(equal-sized) sub-populations or demes (Wilson, 1980;
Sober and Wilson, 1998); see Fig. 1.



[Figure 1 image; panel labels: t=1, t=2]




Figure 1: Growth of cooperators (green) and selfish
individuals (red) living in groups. Individuals in each group
(only two are depicted) are drawn randomly from a global
population (left) such that the proportions of types
(cooperators and defectors) vary slightly between groups. Groups with
more cooperators grow more than groups with fewer cooper- 
ators and therefore contribute more individuals (specifically 
cooperators) to the global cell-count. Hence, the global pro- 
portion of cooperators increases (right). 

Note this model assumes that localised fitness interactions 
are contained within neatly circumscribed groups. To sus- 
tain cooperation at high levels the population must be sub- 
ject to multiple episodes of 'aggregation and dispersal', al- 
ternating between phases with a single 'migrant pool' (the 
global population or a representative sample thereof), and 
phases with multiple localised interaction groups. Without 
a group mixing stage, selfish behaviour would eventually go 
to fixation within each group founded by one or more
selfish individuals (assuming Prisoner's Dilemma cooperative
interactions; Powers et al., 2008; Powers, 2010).



Is this really group selection? It has been widely
argued that this classic model shows nothing more than
individual selection given localised frequency-dependent
effects (Maynard Smith, 1976; Nunney, 1985; Sterelny, 1996;
Grafen, 1984), and hence does not involve group selection
at all. That is, rather than saying groups with more
cooperative individuals are fitter than groups with fewer
cooperative individuals, we could equally say that individuals in
groups with more cooperators are fitter than individuals in 
groups with fewer cooperators. In fact, our position is that 
if we could not explain the outcome of such models in terms 
of (context dependent) individual selection the result would 
be 'mystical' - that is, we would not have an evolutionary 
explanation at all. The behaviour of such models is fully 
explainable, as it must be, in terms of modified selective 
pressures on individuals given the group-living assumed. 
Nonetheless, it is at least interesting to note that the increase
in levels of cooperation is consistent with the differential
productivity of groups, i.e., more cooperative groups are 
fitter in terms of the genetic contribution they make to fu- 
ture generations, as well as consistent with the differential 
productivity of individuals, i.e. individuals in more coop- 
erative groups are fitter in terms of the genetic contribution 
they make to future generations (Dugatkin and Reeve, 1994;
Kerr and Godfrey-Smith, 2002). Indeed, this has to be the
case because in this (very common) kind of multilevel
selection model, group fitness is by definition the mean
individual fitness of the group members (Damuth and Heisler,
1988; Okasha, 2006). However, even this pluralist position
seems to be on shaky ground when the groups are not
neatly defined. For example, how can we empirically mea- 
sure group phenotypes (e.g., level of cooperation within the 
group) if we cannot identify discrete groups? In this case, 
a group-based account will lose accuracy whereas the
individual selection perspective remains undeniably precise
(Godfrey-Smith, 2006).



Is the standard model relevant to natural populations? 

The standard model describes neatly partitioned
sub-populations where the benefits of cooperative acts are
distributed equally to members within each group, but not to
members of other groups (Wilson, 1980; Godfrey-Smith,
2006). Such idealised conditions are likely to be rare in
natural populations. Of course, the effect does not immediately
vanish when groups are less neat. But in such cases,
localised frequency-dependent selection seems a perfectly
adequate explanation (Maynard Smith, 1976), and there seems
to be little value in arguing for a 'group selection' account. 
Moreover, even if we wanted to retain a group selection 
framework, it is not clear how we could measure and
quantify the differential productivity of groups in realistic
scenarios where groups are somewhat ill defined
(Godfrey-Smith, 2006).

These considerations should not lead one to conclude,
however, that there is nothing of consequence presented in
the idealised models (Okasha, 2006), nor that nothing
interesting can happen in natural populations. But it is a bit tricky
to say what it is exactly, and more tricky to know how to 
measure it in a natural population. Certainly, if we were 
to assess the level of cooperation in a natural population, 
and then (assuming this were practically possible) assess it 
again in an artificially well-mixed version of the same ex- 
periment, we might see a difference in the two levels. This 
would at least tell us that localised frequency-dependent ef- 
fects were significant in this system. But frankly, it does 
not sound all that interesting - it is rather obvious that se- 
lective pressures will be different in well-mixed populations 
if locally dispersed resources or public goods are involved. 
Simply examining the global frequency of cooperation tells 
us nothing about the mechanism behind its evolution, e.g., is 
cooperation a simple mutualism or is it individually-costly? 

Moreover, although a comparison of well-mixed versus
spatial or viscous populations is possible in synthetic
simulations, it is impractical in natural populations.
Mechanically mixing a biofilm, say, or adding surfactants to
break up the extracellular matrix that holds cells together,
would not merely alter spatial relationships but could also
affect many important environmental factors, confounding
the result. We are left, therefore, with a significant gap
between the theoretic idealisations of group selection and
methodology that would be useful in practical situations
(West et al., 2008).

An alternative is to look for a Simpson's paradox in situ. 
A Simpson's paradox clearly emphasises the crucial
mechanics of multilevel selection (Sober and Wilson, 1998;
see below), and it can be measured in situ so that it does not
require disruption of the natural population structure. 

Group selection and Simpson's paradox 

Simpson's paradox is a statistical phenomenon that arises 
when correlations or trends within sub-groups of a data set 
fail to represent the overall correlation when all the data is
assessed together (Simpson, 1951; Sober and Wilson, 1998).
Table 1 shows a very simple hypothetical example based on 
a group selection scenario. It shows the numbers of cooper- 
ators and selfish individuals in two groups, A and B, at two 
time points, t = 1 and t = 2. Note that both groups show a 
decrease in the proportion of cooperators in this time inter- 
val, yet overall, from the same data, there is nonetheless an 
increase in the total proportion of cooperators. 

It may be useful to clarify that at a given point in time, 
the average within-group proportion of cooperators can be 
different from the global proportion of cooperators. This is 
simply because the average within-group proportion weights 
all groups equally, whereas the global proportion is
implicitly the same summation but with each group
contribution 'weighted' in proportion to its size. In the
example, at t = 1 the groups are equal sized and the average
within-group proportion and the global proportion are
therefore the same. But at the second time point, the groups
are different sizes and the average within-group proportion 
((31% + 62%)/2 = 46.5%) is not equal to the global pro- 
portion (51%). 

In this example then, the growth trend paradox (i.e., co- 
operation decreases within groups but increases globally) is 
caused by the fact that one group grows much more than the 
other. Specifically, the B group, with twice the initial
proportion of cooperators, is assumed to grow at about twice
the rate of the A group in this example. So, although
selfish individuals always grow faster than the cooperators in
any given environment, some cooperators grow faster than 
some selfish individuals (specifically, when cooperators are 
in an environment of many other cooperators). Accordingly, 
because highly cooperative groups grow more, cooperators 
can increase in total proportion even though they decrease in 
proportion within each group. 
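
The arithmetic behind this example can be checked with a minimal Python sketch. It uses only the counts in Table 1; the function names are our own illustrative choices and not part of any established method.

```python
from fractions import Fraction

# Counts copied from Table 1: group -> time point -> (cooperators, selfish).
TABLE = {
    "A": {1: (2, 4), 2: (4, 9)},
    "B": {1: (4, 2), 2: (16, 10)},
}


def proportion(coop, selfish):
    """Exact proportion of cooperators in one group."""
    return Fraction(coop, coop + selfish)


def local_average(t):
    """Unweighted mean of the within-group proportions at time t."""
    props = [proportion(*TABLE[g][t]) for g in TABLE]
    return sum(props) / len(props)


def global_proportion(t):
    """Proportion of cooperators when all groups are pooled at time t."""
    coop = sum(TABLE[g][t][0] for g in TABLE)
    selfish = sum(TABLE[g][t][1] for g in TABLE)
    return proportion(coop, selfish)


def simpsons_paradox(t0, t1):
    """True if cooperators fall in every group yet rise globally."""
    falls_everywhere = all(
        proportion(*TABLE[g][t1]) < proportion(*TABLE[g][t0]) for g in TABLE
    )
    return falls_everywhere and global_proportion(t1) > global_proportion(t0)


if __name__ == "__main__":
    for t in (1, 2):
        print(t, float(local_average(t)), float(global_proportion(t)))
    print("paradox:", simpsons_paradox(1, 2))
```

With exact fractions the average within-group proportion at t = 2 is 6/13 (about 46.2%); the 46.5% quoted above comes from averaging the rounded percentages.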

Using Simpson's paradox to indicate group 
selection 

Simpson's paradox as a basis for group selection is well un- 
derstood. However, it is generally not used as a direct indica- 
tor of group selection. Instead, the norm is simply to assess 
the global level of cooperation and see if it increases. But 
in practical experiments this is insufficient to conclude that 
group selection is responsible for such an increase. When 
the exact form of the evolutionary game that individuals are 
engaged in is unknown, due to numerous modes of interac- 
tion and multiple 'public goods' for example, or competi- 
tion for multiple resources, it can be difficult to genuinely 
ascertain whether the 'cooperator' is really cooperating and 
whether the 'selfish' type is really selfish. That is, should we 
be surprised that the global level of cooperation increases, 
or is it a simple case of mutualism? The obvious control is 
to compare with a well-mixed population or to increase the 
diffusion rate in a spatial model, but aside from the prac- 
tical difficulties of this in natural populations (even bacte- 
rial ones), this cannot maintain the 'all other things being 
equal' condition necessary to determine that only the local- 
isation of interactions is producing the difference in results. 
Instead, by looking for a divergence between the average 
within-group and global proportions of types, we can both 
verify that the types are behaving as expected (that in any 
given environment the selfish individuals have the advan- 
tage) and identify a group selection effect if there is one. 
Thus Simpson's Paradox provides an in situ measurement 
of group selection in the sense that we do not need to dis- 
rupt groups to provide a control, and can therefore assess 
the effect that groups are having merely by observing how 
the frequencies of types change in the natural population. 

            t = 1                     t = 2
        Coop  Selfish  %Coop      Coop  Selfish  %Coop
A          2        4    33%*        4        9    31%
B          4        2    67%*       16       10    62%
Total      6        6    50%        20       19    51%*

Table 1: Numbers of cooperative and selfish individuals in two hypothetical groups, illustrating Simpson's paradox. An asterisk
marks the time point where the proportion of cooperators is highest. Note that within both group A and group
B the proportion of cooperators decreases over this period, but overall, the proportion of cooperators increases.

To measure Simpson's Paradox in scenarios that have
poorly defined groups requires an additional small step. For
this we propose the following practical methodology for a
spatially distributed population. Rather than attempt to
define boundaries around one group and distinguish it from
another, we can simply divide the physical space into
equal-sized local regions and measure both the average local
proportion of cooperators within all regions, and the global
proportion of cooperators. If the selfish individuals are indeed
selfish, then the average local proportion of cooperators
must always be declining. But if, at the same time,
the global proportion of cooperators is increasing, then there
is significant group selection activity.
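
The measurement just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive procedure: cells are represented as hypothetical (x, y, is_cooperator) tuples, regions are square tiles aligned to the origin, and tiles containing no cells are skipped because their proportion is undefined (the text does not say how empty regions should be treated).

```python
import math
from collections import defaultdict


def local_and_global(cells, region_size):
    """Return (average local proportion, global proportion) of cooperators.

    cells: iterable of (x, y, is_cooperator) tuples (an assumed representation).
    region_size: side length of the square tiles used as local regions.
    """
    tiles = defaultdict(lambda: [0, 0])  # tile index -> [cooperators, total]
    for x, y, is_coop in cells:
        key = (math.floor(x / region_size), math.floor(y / region_size))
        tiles[key][0] += int(is_coop)
        tiles[key][1] += 1
    if not tiles:
        raise ValueError("no cells to measure")

    # Every occupied tile counts once, whatever its density (assumption:
    # empty tiles are ignored).
    avg_local = sum(c / n for c, n in tiles.values()) / len(tiles)

    total_coop = sum(c for c, _ in tiles.values())
    total = sum(n for _, n in tiles.values())
    return avg_local, total_coop / total
```

Calling this at successive time points and comparing the two returned values gives the local and global trends needed to look for the paradox.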



Note that if every region exhibited approximately the 
same amount of total cell growth, then a paradox could not 
occur; but if some local regions are growing much faster 
than others (because local frequency-dependent fitness ef- 
fects are sufficiently strong) a Simpson's Paradox may be 
observed. In principle, it does not matter whether the space 
is divided into contiguous tiles (as we employ below), or 
whether regions are selected at random with random centres. 
But it does matter that regions are not selected in any manner 
that is biased by cell density, for that would amount to taking 
a weighted average. Taking a weighted average would nec- 
essarily make the local average the same as the global, and 
so would result in the local group dynamics disappearing 
from the analysis. This is the "averaging fallacy" described
by Sober and Wilson (1998), which causes the appearance
of group selection to vanish. For example, measuring the 
proportion of cooperators in the vicinity of each and every 
cell or within its radius of influence will bias measurements 
of local proportions in such a manner that dense areas con- 
tribute more to the average in exact proportion to how dense 
they are - in this case, the average local proportion cannot 
be different from the global proportion. 
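
A small numerical sketch makes the point concrete. Here each region is summarised as a hypothetical (cooperators, total cells) pair, using the t = 2 column of Table 1; weighting regions by their cell count stands in for the per-cell sampling described above, which is a simplification on our part.

```python
def unweighted_local_average(regions):
    """Mean of regional proportions, each region counted once."""
    return sum(c / n for c, n in regions) / len(regions)


def density_weighted_local_average(regions):
    """Mean of regional proportions, each weighted by its share of all cells."""
    total = sum(n for _, n in regions)
    return sum((n / total) * (c / n) for c, n in regions)


def global_proportion(regions):
    """Proportion of cooperators in the pooled population."""
    return sum(c for c, _ in regions) / sum(n for _, n in regions)


# (cooperators, total cells) for groups A and B at t = 2 in Table 1.
REGIONS_T2 = [(4, 13), (16, 26)]

if __name__ == "__main__":
    print(unweighted_local_average(REGIONS_T2))
    print(density_weighted_local_average(REGIONS_T2))
    print(global_proportion(REGIONS_T2))
```

The density-weighted average collapses onto the global proportion (both 20/39), whereas the unweighted average (6/13) does not, so only the unweighted measure can reveal a local/global discrepancy.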



In the remainder of this paper we develop a simple 
individual-based model of bacterial growth, such as would 
apply to a locally-dispersing 'public good', to illustrate the 
use of this methodology and as a basis for discussion of sev- 
eral additional complicating factors that are important in its 
application. Of particular interest is the possibility of mea- 
suring the local proportions at several different spatial scales 
to determine the effective scale of selection. 



An individual-based model

Bacterial Biofilms

In developing the following model we have bacterial
biofilms in mind. Social evolution in bacterial systems
is currently receiving considerable attention both as a
model system of social evolution and because of the
practical implications of biofilms (Crespi, 2001; Griffin et al.,
2004; Burmølle et al., 2006). Biofilms show a physical
structure especially suited for localised fitness interactions
via the formation of semi-isolated micro-colony structures
(Hall-Stoodley et al., 2004). However, the following model
is general - not dependent on any of the particulars that per- 
tain to specific bacterial strains or types of fitness interac- 
tion. The vital assumptions are that there are two types of 
individual, that the presence of one of these types (but not 
the other) is beneficial to other individuals within a certain 
spatial radius, and that this type bears a cost for providing 
this benefit. For example, one type may be a wild-type strain
of Pseudomonas aeruginosa that releases into the
environment an enzyme useful for binding iron (Griffin et al., 2004).
This enzyme can be understood as a 'public good' because 
it can be used by others within the diffusion radius of the 
molecule. The other type may be a selfish mutant strain that 
does not produce the public good and is therefore not bur- 
dened by its production, but can, like any other individual, 
benefit from the public good produced by cooperators. 

Model definition 

The state of the model at any point in time is defined by
a population of individuals, each of which has a type
(cooperator or selfish), an age, a location in continuous 2D
space and a 'reproductive potential'. Reproductive potential
can be thought of as the resources the individual has
accumulated over time. There is no explicit modelling of the
public good, diffusion constants, extracellular matrix, or such
like, and in the default model, cells do not move. At every
time step, the reproductive potential of each cell is
incremented by a fitness benefit, W. This is a function of both
the individual's own type, and of the proportion of cooperators
in the local vicinity. Specifically, the fitness benefit of an
individual is:



W = m + Pb - c,        (1)



where m = 1.5 is a constant representing the intrinsic
growth rate, P is the proportion of cooperators within
a given radius, r1 = 15, of the individual (including
itself), b = 4 is a constant representing the fitness benefit
received from cooperators, and c is the cost of being a
cooperator (i.e., the cost of producing the public good), with
c = 0 for selfish individuals and c = 1.8 for cooperators. This
fitness function is standard in evolutionary models of altruism
(Wilson, 1980), and amounts to an n-player public goods
game / Prisoner's Dilemma (Fletcher and Zwick, 2007).
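
Equation 1 can be written out as a short Python sketch using the parameter values given above. The tuple representation of cells and the function names are illustrative assumptions; in particular, we assume distance to be Euclidean and the radius to be inclusive, which the text does not specify.

```python
import math

M = 1.5     # intrinsic growth rate, m
B = 4.0     # benefit received from cooperators, b
COST = 1.8  # cost c paid by cooperators (selfish individuals pay 0)
R1 = 15.0   # radius of social interaction, r1


def local_cooperator_proportion(focal, cells, radius=R1):
    """P: proportion of cooperators within `radius` of the focal cell.

    `focal` must itself be an element of `cells`, so it is included in P.
    Cells are (x, y, is_cooperator) tuples (an assumed representation).
    """
    fx, fy, _ = focal
    near = [is_coop for x, y, is_coop in cells
            if math.hypot(x - fx, y - fy) <= radius]
    return sum(near) / len(near)


def fitness_benefit(is_coop, p):
    """W = m + P*b - c, with c = COST for cooperators and 0 otherwise."""
    return M + p * B - (COST if is_coop else 0.0)
```

For any shared value of P a selfish individual receives 1.8 more than a cooperator, yet a cooperator surrounded only by cooperators (P = 1, W = 3.7) outgrows a selfish individual with no cooperators nearby (P = 0, W = 1.5).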

The model proceeds by updating each individual, in each
time step, according to Algorithm 1.

Algorithm 1 Individual update algorithm.

1. The age is incremented by 1.

2. If the age is 5 the cell dies.

3. Otherwise, the fitness benefit is calculated (as above) and
added to the individual's current reproductive potential.

4. Whilst the reproductive potential > 4,

(a) Reproduce, placing descendant cell in a new location
according to a placement algorithm (see text). An
offspring is an exact genetic clone of its parent.

(b) Decrement reproductive potential by 4.
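
The steps of Algorithm 1 can be sketched in Python as follows. The `Cell` class and helper names are hypothetical, and two details are our assumptions rather than statements from the text: offspring are placed uniformly over a disc of radius 5 around the parent, and no boundary handling or competition for space is applied.

```python
import math
import random
from dataclasses import dataclass

MAX_AGE = 5        # a cell dies when its age reaches 5
THRESHOLD = 4.0    # reproductive potential consumed per offspring
PLACE_RADIUS = 5.0 # default offspring placement radius


@dataclass
class Cell:
    x: float
    y: float
    is_coop: bool
    age: int = 0
    potential: float = 0.0


def place_offspring(parent, rng):
    """Clone the parent at a random point within PLACE_RADIUS (assumed uniform)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    dist = PLACE_RADIUS * math.sqrt(rng.random())
    return Cell(parent.x + dist * math.cos(angle),
                parent.y + dist * math.sin(angle),
                parent.is_coop)


def update(cell, benefit, rng):
    """Apply one update step; return (survives, list of new offspring)."""
    cell.age += 1                       # step 1
    if cell.age >= MAX_AGE:             # step 2
        return False, []
    cell.potential += benefit           # step 3
    offspring = []
    while cell.potential > THRESHOLD:   # step 4
        offspring.append(place_offspring(cell, rng))  # 4(a)
        cell.potential -= THRESHOLD                   # 4(b)
    return True, offspring
```

Here `benefit` is the value of W from Equation 1 for that cell, computed before the update is applied; passing in a `random.Random` instance keeps runs reproducible.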



The model is initialised with equal numbers of cooper- 
ators and selfish individuals distributed uniformly at ran- 
dom. Each initial cell (and new cell from reproduction) is 
initialised with reproductive potential=0, and age=0. The 
placement algorithm may take account of competition for 
space (and possibly fail to produce an offspring if space does 
not allow) but by default it simply places an individual in a 
random location within a radius, r2 = 5. Thus, an offspring
is placed close to its parent. 

Measuring the global proportion of cooperators is trivial.
To measure the average local proportion of cooperators,
the space is divided into contiguous square regions of
size r3 = 15 (note that the area of each square local region,
(r3)^2 = 225, in which local proportions are measured, is
the same order of magnitude as the circular area over which
a cooperator may affect other individuals, π(r1)^2 ≈ 707;
see Fig. 5).

In an advanced version of the model, cells are motile and 
move toward cooperators. This represents attraction towards 
concentration of the public good, for example. At each time 
step, a vector is calculated which is a distance-discounted 
sum of vectors to all other local regions, weighted by the 
number of cooperators in that region. The regions used are
the same as those used for calculating the average local
proportion of types. Each cell then moves a random distance d
in the direction of this vector; d is uniformly distributed in
the range 0 to 15r4, where r4 is a constant controlling the
amount of movement.
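
One way the movement rule might look in code is sketched below. The text does not give the exact form of the distance discount, so a 1/distance weighting on the unit vector towards each region's centre is assumed here purely for illustration; the function names and the dictionary of per-tile cooperator counts are likewise hypothetical.

```python
import math
import random


def attraction_direction(x, y, coop_counts, region_size):
    """Unit vector towards cooperator-rich regions, or (0, 0) if there is none.

    coop_counts: {(i, j): number of cooperators in tile (i, j)}.
    Assumption: each other tile pulls along the unit vector to its centre,
    weighted by its cooperator count and discounted by 1/distance.
    """
    own = (math.floor(x / region_size), math.floor(y / region_size))
    vx = vy = 0.0
    for (i, j), n in coop_counts.items():
        if (i, j) == own or n == 0:
            continue  # only *other* regions contribute
        dx = (i + 0.5) * region_size - x
        dy = (j + 0.5) * region_size - y
        dist = math.hypot(dx, dy)
        vx += n * dx / dist ** 2
        vy += n * dy / dist ** 2
    norm = math.hypot(vx, vy)
    if norm == 0.0:
        return 0.0, 0.0
    return vx / norm, vy / norm


def move(x, y, coop_counts, region_size, r4, rng):
    """Move a random distance d ~ U(0, 15 * r4) along the attraction direction."""
    ux, uy = attraction_direction(x, y, coop_counts, region_size)
    d = rng.uniform(0.0, 15.0 * r4)
    return x + d * ux, y + d * uy
```

Setting r4 = 0 recovers the default non-motile model, since every cell then moves a distance of zero.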

Model illustrations 

We initialised each simulation with 150 cooperators and 150 
selfish individuals, distributed randomly across a square grid 
of size 250 × 250. Each simulation was repeated 50 times,
and the mean of both the average local and global propor- 
tions of cooperators recorded. 

Figure 2 shows that although the initial distribution of
bacterial cells is random, the cells grow into spatial clus- 
ters due to non-motility and the fact that offspring are placed 
close to their parents (as per Model definition). 




Figure 2: Illustration of biofilm growth in the model. Green 
cells are cooperators, red are selfish cheats. 

From standard social evolution theory, we would not
expect cooperation to increase or be stable in the absence of
localised interactions (Wilson, 1980). Thus, in such cases we
should not see a Simpson's Paradox, since without localised 
interactions there should be no difference in the growth-rates 
of different localities, ceteris paribus. We verified that this 
was the case in our model by making the radius of social 
interactions, r1, equal to the size of the whole grid. Thus,
each individual would experience the global proportion of 
cooperation for the purposes of determining their fitness. 
This corresponds to complete mixing of the public good, 
but not of the individuals themselves. Thus, we still mea- 
sured the local proportion of cooperation across squares of 
size r3 = 15. As Figure 3a shows, the global frequency of
cooperation steadily declines in this case, and there is no ob- 
servation of a Simpson's Paradox. This is because although 
there are still spatial groups in the system, membership of 
these groups does not affect fitness when the public good is 
global, and hence they are meaningless to evolution. This 
serves as an illustration of the fact that the groups we can 
readily observe in a system (e.g., the clusters in our model) 
may not be the same scale as the groups that matter for the 
evolution of cooperation (in the case of well-mixed public 
goods, the 'group' is the whole population). 



On the other hand, in Figure 3b we set the radius of
the public good to r1 = 15. This represents localised
interactions, and so we might expect cooperation to evolve.
Moreover, we set the window size over which we measure
local proportions of cooperators to be of this same scale
(r3 = 15). In this case cooperation evolves, and we observe
a difference between average local and global proportions
of cooperation, and hence a Simpson's Paradox. It should
be noted that Simpson's Paradox is present even when the
global proportion of cooperators is falling, so long as the
average local proportion of cooperators is falling at a faster
rate (e.g., generations 1-6 in Figure 3b). In this case there is
a non-zero between-group component of selection, but this
is weaker than within-group selection.

Figure 3: A) When the radius of interaction, r1, covers
the entire space, cooperation does not evolve and Simpson's
Paradox is not observed. B) When r1 = 15 cooperation
evolves, and there is a difference between local and global
proportions. C) Multiple aggregation and dispersal cycles
with r1 = 15.

Figure 3b also illustrates that the paradox cannot be
sustained indefinitely. This is because selfish individuals are
fitter than cooperators sharing the same public good (same
P value but c = 0 in Equation 1). Thus, they must
necessarily increase in frequency within each locality. As this
happens, the differential growth of different localities
decreases, and hence the paradox reduces. In Figure 3b the
paradox peaks at 14 generations, after which the global
frequency of cooperation starts to fall back down. This
seemingly inevitable decrease in cooperation as the generations
go by need not occur, however, if individuals are
periodically mixed and redistributed in space (Sober and Wilson,
1998). Essentially this is because such a redistribution of
individuals reestablishes variance in the proportion of
cooperators (and hence in the amount of the public good)
between groups, and so once again allows for differential
group productivity to have an effect and create a paradox.
This is illustrated in Figure 3c, where dispersal
from clusters and global mixing occurs every 14 generations.
These dispersal events explain the see-saw shape
of the average local curve: at each dispersal event, the
average local proportion is returned to the global proportion
of cooperators. Dispersal is known to occur in natural
biofilms (Ghannoum and O'Toole, 2004) (although
simultaneous and complete mixing is a simplifying assumption of
our model), and the single-celled bottleneck in the
development of multicellular organisms provides a similar
redistribution of genetic variance (Maynard Smith and Szathmáry,
1995; Michod, 1999). Thus, some degree of dispersal is
likely to be important in maintaining cooperation in
natural populations (West et al., 2002), and may actually be
an evolutionary adaptation at least partly for this purpose
(Maynard Smith and Szathmáry, 1995; Michod, 1999).
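
The periodic mixing step can be sketched as below. This is a simplified illustration: cells are hypothetical (x, y, is_cooperator) tuples, every cell is relocated uniformly at random over the grid at once, and the default period and grid size are taken from the values quoted in the text. How age and reproductive potential are treated at dispersal is not stated, so the sketch leaves them out.

```python
import random

GRID_SIZE = 250.0
DISPERSAL_PERIOD = 14  # generations between global mixing events


def disperse(cells, rng, grid_size=GRID_SIZE):
    """Keep every cell's type but redraw its position uniformly over the grid."""
    return [(rng.uniform(0.0, grid_size), rng.uniform(0.0, grid_size), is_coop)
            for _, _, is_coop in cells]


def maybe_disperse(cells, generation, rng, period=DISPERSAL_PERIOD):
    """Apply global mixing on generations that are multiples of `period`."""
    if generation > 0 and generation % period == 0:
        return disperse(cells, rng)
    return cells
```

Because only positions change, the global proportion of cooperators is untouched by a dispersal event, while the spatial clustering (and hence the local averages) is reset.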

Figure 4 shows the effect of cell motility on the
observation of Simpson's Paradox. Again, from standard theory
we would expect increasing motility to reduce global levels
of cooperation. We see that increasing motility decreases
Simpson's Paradox. This is because it increases the
homogeneity of localities, making their P values more similar and
hence the differential in group productivity lower.

Figure 5 shows how the peak observation of a Simpson's
Paradox changes depending on the scale at which local
proportions of cooperators are measured. Observation of the
paradox will peak when this scale corresponds to the actual
scale of social interactions in the system, e.g., to the radius
in which the public good is shared. The peak in Figure 5
is where the measured locality size corresponds,
approximately, to r1, the actual scale of interaction. Measuring
Simpson's Paradox using different local scales could thus
be used to determine the actual scale of social interactions
in a real-world system, where this may well not be known a
priori.

Figure 4: Effect of increasing cell motility on the
observation of Simpson's Paradox. Error bars show standard
deviation.

Figure 5: Effect of the magnitude of the locality size
measured on the observation of Simpson's Paradox (difference
between local average and global proportion of cooperators). 
The observed paradox is strongest when the measured local- 
ity size corresponds to the actual scale of social interaction; 
measurements were taken after the number of generations 
that yielded the peak difference between global and local 
frequencies, for each window size. Error bars show stan- 
dard deviation. The length of the error bars increases with 
the window size because a larger window size corresponds 
to fewer localities and hence fewer samples to average over. 
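
The scan over window sizes can be sketched as follows. The sketch works on a single snapshot of hypothetical (x, y, is_cooperator) tuples, whereas the figure uses, for each window size, the generation at which the difference peaked; it also defines the magnitude as global minus average local proportion so that a paradox gives a positive value, and it skips empty tiles. These choices are assumptions made for illustration.

```python
import math
from collections import defaultdict


def paradox_magnitude(cells, region_size):
    """Global proportion of cooperators minus the average local proportion."""
    tiles = defaultdict(lambda: [0, 0])  # tile index -> [cooperators, total]
    for x, y, is_coop in cells:
        key = (math.floor(x / region_size), math.floor(y / region_size))
        tiles[key][0] += int(is_coop)
        tiles[key][1] += 1
    avg_local = sum(c / n for c, n in tiles.values()) / len(tiles)
    glob = (sum(c for c, _ in tiles.values())
            / sum(n for _, n in tiles.values()))
    return glob - avg_local


def effective_scale(cells, candidate_sizes):
    """Window size, among the candidates, with the largest paradox magnitude."""
    return max(candidate_sizes, key=lambda s: paradox_magnitude(cells, s))
```

In practice the candidate sizes would span the range of plausible interaction radii, and the estimate is only as fine-grained as that list of candidates.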



Discussion 

We have presented a methodology for measuring the effect 
of group-level selection in natural populations. Real-world 
populations may often not be formed of clearly observable 
groups with discrete boundaries, which makes the applica- 
tion of standard multilevel selection theory non-trivial. In 
particular, theoretical techniques for measuring the strength 
of group selection, such as the Price Equation or contextual 
analysis, rely on being able to measure properties of discrete
groups (Godfrey-Smith, 2006). Thus, their application
to systems such as bacterial biofilms remains problematic. 

Here, we have suggested observation of Simpson's Paradox
as a way to quantify the effect of group-level selection
in a natural population. It is now widely appreciated
that Simpson's Paradox, the difference between average local
and global frequencies of cooperation, will be present
whenever individually-costly cooperative behaviours evolve
(Sober and Wilson, 1998). Moreover, its presence indicates
multiple scales of selection in a system (Sober and Wilson,
1998). However, discussions of Simpson's Paradox have so
far remained in the theoretical domain. In particular, illus- 
trations of it have, to our knowledge, only been conducted 
in models with discrete group boundaries. By contrast, we 
have shown that Simpson's Paradox can be readily mea- 
sured in populations where individuals are continuously dis- 
tributed throughout space. Thus, the exact group structure 
does not have to be known a priori for this technique to be 
applied. We have illustrated the measurement of Simpson's 
Paradox in such a case with an individual-based model of 
public goods production in bacterial biofilms. 
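
As a minimal sketch of the measurement itself, the following computes the change in the global proportion of cooperators and the change in each local proportion between two snapshots, and reports their difference. The grid representation, the function names block_sum and simpson_gap, and the example counts are all assumptions made for illustration; they are not taken from the model described above.

```python
import numpy as np

def block_sum(grid, window):
    """Sum a 2D grid of cell counts over non-overlapping window x window tiles."""
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    if rows % window or cols % window:
        raise ValueError("window must divide both grid dimensions")
    tiles = grid.reshape(rows // window, window, cols // window, window)
    return tiles.sum(axis=(1, 3))

def simpson_gap(coop_t0, defect_t0, coop_t1, defect_t1, window):
    """Return (global change - mean local change, global change, local changes)."""
    c0, d0 = block_sum(coop_t0, window), block_sum(defect_t0, window)
    c1, d1 = block_sum(coop_t1, window), block_sum(defect_t1, window)
    # Only localities that contain cells in both snapshots have a defined proportion.
    occupied = ((c0 + d0) > 0) & ((c1 + d1) > 0)
    local_change = (c1[occupied] / (c1 + d1)[occupied]
                    - c0[occupied] / (c0 + d0)[occupied])
    global_change = c1.sum() / (c1 + d1).sum() - c0.sum() / (c0 + d0).sum()
    return global_change - local_change.mean(), global_change, local_change

# Hypothetical counts for two localities (one grid cell each).
coop_t0, defect_t0 = np.array([[8, 2]]), np.array([[2, 8]])
coop_t1, defect_t1 = np.array([[30, 2]]), np.array([[10, 10]])

strength, global_change, local_change = simpson_gap(
    coop_t0, defect_t0, coop_t1, defect_t1, window=1)
print("local changes:", local_change)
print("global change:", global_change)
print("difference:", strength)
```

In this example the cooperator proportion falls in both localities, yet the global proportion rises because the cooperator-rich locality grows far more than the other. A positive difference of this kind is the signature of the paradox.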

Significantly, measurement of Simpson's Paradox can be 
used to determine the effective group structure in a natural 
population. Specifically, the difference between average lo- 
cal and global proportions of cooperation will peak when the 
size of localities measured is of the same scale as that over 
which the public good is shared, that is, when the measurement
window size matches the scale of fitness-affecting social
interactions. Wilson (1980) terms the sets of individuals
within which such interactions occur "trait groups". He stresses that the
groups which matter to natural selection are subsets of in- 
dividuals in which fitness-affecting interactions occur, and 
that these subsets may not correspond to the apparent groups 
that are most readily observable in a population. For ex- 
ample, although discrete clusters may be observable in a 
biofilm, these may not correspond to the radius over which 
a public good diffuses. Varying the window size over which 
the change in local proportions of cooperators is measured, 
and looking for the peak difference with the global propor- 
tion, can identify the effective trait groups in the population. 
Searching for the trait groups in this way can be done by im- 
age analysis at the end of the experiment - the experiment 
does not have to be re-run in order to measure Simpson's 
Paradox on different scales. Regarding biofilms, one may 
also measure local proportions using regions that specifi- 
cally enclose micro-colonies to see if micro-colony struc- 
ture is a stronger selective unit than arbitrary local regions. 
That is, our methodology can be used to determine whether 
the micro-colonies correspond to trait groups, or whether the 
trait groups are in fact smaller or larger. 
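
The window-size scan described above can be sketched as follows, assuming that image analysis has already produced cell coordinates and types for two snapshots. The function names, the square tiling of the domain, and the synthetic data are hypothetical choices for illustration only. The synthetic population is built from 2 x 2 patches in which cooperators and defectors occupy separate unit squares, so the effective group scale is 2 by construction.

```python
import numpy as np

def local_counts(xy, extent, window):
    """Count cells in each square window tiling the domain [0, extent)^2."""
    n_bins = int(round(extent / window))
    counts, _, _ = np.histogram2d(xy[:, 0], xy[:, 1], bins=n_bins,
                                  range=[[0, extent], [0, extent]])
    return counts

def paradox_strength(snap0, snap1, extent, window):
    """Global change in cooperator proportion minus the mean local change.

    Each snapshot is a (cooperator_xy, defector_xy) pair of (n, 2) arrays.
    """
    c0, d0 = (local_counts(xy, extent, window) for xy in snap0)
    c1, d1 = (local_counts(xy, extent, window) for xy in snap1)
    occupied = ((c0 + d0) > 0) & ((c1 + d1) > 0)
    local = (c1[occupied] / (c1 + d1)[occupied]
             - c0[occupied] / (c0 + d0)[occupied])
    glob = c1.sum() / (c1 + d1).sum() - c0.sum() / (c0 + d0).sum()
    return glob - local.mean()

def estimate_trait_group_scale(snap0, snap1, extent, windows):
    """Return the window size with the strongest paradox, plus all strengths."""
    strengths = [paradox_strength(snap0, snap1, extent, w) for w in windows]
    return windows[int(np.argmax(strengths))], strengths

def cells(unit_squares, per_square):
    """Place per_square cells at the centre of each listed unit square."""
    centres = np.array(unit_squares, dtype=float) + 0.5
    return np.repeat(centres, per_square, axis=0)

# Synthetic 4 x 4 domain: two cooperator-rich and two defector-rich patches.
rich_coop = [(0, 0), (1, 0), (0, 1), (2, 2), (3, 2), (2, 3)]
rich_def = [(1, 1), (3, 3)]
poor_coop = [(2, 0), (0, 2)]
poor_def = [(3, 0), (2, 1), (3, 1), (1, 2), (0, 3), (1, 3)]

snap0 = (cells(rich_coop + poor_coop, 2), cells(rich_def + poor_def, 2))
snap1 = (np.concatenate([cells(rich_coop, 8), cells(poor_coop, 2)]),
         np.concatenate([cells(rich_def, 12), cells(poor_def, 3)]))

best, strengths = estimate_trait_group_scale(snap0, snap1, extent=4, windows=[1, 2, 4])
print("strength per window size:", strengths)
print("estimated trait group scale:", best)
```

Both snapshots are analysed after the fact, so the scan needs no further experimental runs. In real image data, small windows will also be noisier and more often empty, which this toy example does not capture.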

In future work, it would be interesting to investigate 
whether the Price Equation can be meaningfully applied to 
the appropriate window size. In particular, our methodol- 
ogy identifies non-arbitrary groups. Thus, once we have 
identified the effective trait group size, we could calculate 
the covariance between group character (local proportion of 
cooperators), and group productivity. Likewise, we could 
calculate the covariance between individual character
(cooperator or not) and individual fitness (number of cell
divisions). Our methodology also fits within a kin selection
framework (Hamilton, 1964), as used by Griffin et al. (2004)
to study bacterial social evolution, for example. Finding the 
trait groups corresponds to finding the scale at which genetic 
relatedness should be measured in a natural population. 
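
As a hedged sketch of the Price Equation calculation suggested above, the following partitions the global change in cooperator proportion into a between-locality term (the covariance between local cooperator proportion and locality productivity) and a within-locality term. It uses one standard form of the partition, with localities weighted by their initial size and productivity taken as the ratio of final to initial cell count. These choices, the function name price_partition, and the example counts are assumptions for illustration.

```python
import numpy as np

def price_partition(coop0, defect0, coop1, defect1):
    """Split the global change in cooperator proportion into two terms.

    Inputs are per-locality cell counts at the start (0) and end (1).
    Assumes every locality is non-empty in both snapshots.
    Returns (between, within), where between + within equals the global change.
    """
    coop0, defect0, coop1, defect1 = (
        np.asarray(a, dtype=float) for a in (coop0, defect0, coop1, defect1))
    n0, n1 = coop0 + defect0, coop1 + defect1
    q = n0 / n0.sum()                 # weight of each locality (initial size)
    z0, z1 = coop0 / n0, coop1 / n1   # group character: local cooperator proportion
    w = n1 / n0                       # group productivity per initial cell
    w_bar = np.sum(q * w)
    cov_wz = np.sum(q * w * z0) - w_bar * np.sum(q * z0)
    between = cov_wz / w_bar
    within = np.sum(q * w * (z1 - z0)) / w_bar
    return between, within

# Hypothetical counts for two localities.
between, within = price_partition([8, 2], [2, 8], [30, 2], [10, 10])
print("between-locality term:", between)
print("within-locality term:", within)
print("global change:", between + within)
```

When offspring inherit their parent's type, the within-locality term is the weighted average of the within-locality covariances between individual character and individual fitness. A positive between-locality term outweighing a negative within-locality term is how Simpson's Paradox appears in this partition.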

Acknowledgements Thanks to Alex Penn, Jeremy Webb
and Lex Kraaijeveld.